In this module we want to break down phylogenetic trees and understand how R coding can help us compare them.
To get started, you will need to install {ape} and {phytools}.
Objectives
Evolutionary biologists study the ways organisms change over time. This is often done by comparing several species to one another and analyzing the differences between them. These differences can then be used to build a graphical representation of species’ relationships to one another, called a phylogenetic tree. Originally, these trees were constructed based solely on organism morphology using shared derived traits, or synapomorphies, often in the form of character matrices. By focusing on these physical traits, biologists could hypothesize different evolutionary histories, like species divergence or common descent. However, using morphological data to determine taxonomic relationships is not foolproof. Some shared, analogous traits may be due to convergent evolution rather than common ancestry. This kind of similarity, called homoplasy, can be difficult to detect through morphological analysis alone, and has historically resulted in phylogenetic trees that have since been disproved through the use of molecular data.
Modern biologists can now build phylogenetic trees based on the DNA of the organisms in question. By focusing on a few specific genes, evolutionary relationships can be hypothesized with greater accuracy. Examining the polymorphism at different loci can provide information on how distantly related organisms are to one another, and can reduce errors commonly associated with comparing or identifying cryptic species. Molecular data can help resolve unclear phylogenies that were created before genomic methods were available. Biologists can now revisit unresolved trees built from morphological data and revise them using DNA, allowing us to better hypothesize evolutionary relationships.
In order to assess any changes between these hypothesized evolutionary relationships, we can compare the topologies of phylogenetic trees. The topology refers to the branching pattern displayed, which represents the measure of relatedness among taxa. To understand the importance of topology, we must familiarize ourselves with the anatomy of a phylogenetic tree.
Anatomy of a tree:
Why would we compare tree topologies?
There are several different ways to estimate phylogenies, each with its own strengths, weaknesses, and appropriate use cases. We won’t go over them in detail here; to familiarize yourself with them, see Module 24.
There are many phylogenetics programs that implement these estimation methods; some of the most common are PAUP, BEAST, MrBayes, and PHYLIP. These programs take genetic data, usually as sequence alignments, in formats such as FASTA, NEXUS, or PHYLIP.
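Because two estimation methods (or two data sources) can disagree about the same taxa, it helps to have a quick way to measure how different two topologies are. The lines below are a minimal sketch, assuming the {ape} package is installed; the two five-taxon trees are made-up toy examples written in Newick notation (introduced in the next section), not real data.

```r
library(ape)

# Two hypothetical topologies for the same five taxa (toy labels, not real data)
tree1 <- read.tree(text = "(((A,B),C),(D,E));")
tree2 <- read.tree(text = "(((A,C),B),(D,E));")

# all.equal() on phylo objects compares labels and branching pattern
same_topology <- all.equal(tree1, tree2, use.edge.length = FALSE)

# dist.topo() counts the splits that differ between the two trees
# (method "PH85" is the Penny and Hendy 1985 topological distance)
topo_distance <- dist.topo(tree1, tree2, method = "PH85")

print(same_topology)
print(topo_distance)
```

A distance of zero means the two trees share every split; larger values mean more of the branching pattern differs.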
Types of Tree Formats: Newick, Nexus
There are several different formats that a graphical tree can be built from.
Newick: A plain-text format that uses parentheses and commas to describe how clades are nested, optional colons to attach branch lengths (such as time or evolutionary distance), and a semicolon to mark the end of each tree. “Newick files are simply text files that consist of one or more tree descriptions in the Newick notation. In contrast to Nexus files they contain no further syntax elements or other information than the trees.”
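To make the syntax concrete, here is a small sketch (assuming {ape} is installed) that reads a toy Newick string with branch lengths and looks at what R stores. The taxon names and the lengths are invented for illustration.

```r
library(ape)

# Toy Newick string: the number after each colon is a branch length
newick_string <- "((A:1,B:1):2,C:3);"
toy_tree <- read.tree(text = newick_string)

toy_tree$tip.label     # tip names in the order they were read
toy_tree$edge.length   # one branch length per edge

# write.tree() converts the phylo object back to a Newick string
write.tree(toy_tree)
```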
We will be comparing two graphical hypotheses of the genus Adelpha, a group of butterflies.
One tree is morphologically based, from Keith Willmott’s 2003 paper “Cladistic analysis of the Neotropical butterfly genus Adelpha (Lepidoptera: Nymphalidae), with comments on the subtribal classification of Limenitidini.” Our molecular tree is from Emily Ebel’s 2015 paper “Rapid diversification associated with ecological specialization in Neotropical Adelpha butterflies.”
To start off, we want to understand how a tree is created. For this example we will be making our own tree using Newick format.
Here is a great explanation of Newick format: “Put simply, monophyletic clades are surrounded by parentheses and sister clades are separated by commas. For example, a simple tree could be written as (((A,B),C),(D,E)).” Let’s try making that!
You can check which version of R you are running, and whether it needs updating, with the command “R.version”
R.version
## _
## platform x86_64-apple-darwin17.0
## arch x86_64
## os darwin17.0
## system x86_64, darwin17.0
## status
## major 4
## minor 1.1
## year 2021
## month 08
## day 10
## svn rev 80725
## language R
## version.string R version 4.1.1 (2021-08-10)
## nickname Kick Things
Once you know you have the correct version of R, install and load the following packages:
library(ape)
library(fastmatch)
library(quadprog)
library(phangorn)
library(phytools)
## Loading required package: maps
library(geiger)
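If any of the library() calls above fails, the package is probably not installed yet. The following is one possible way to check, using only base R; the vector name needed is just an illustrative choice, and the install line is left commented out so nothing is downloaded unless you choose to.

```r
# Packages used in this module
needed <- c("ape", "fastmatch", "quadprog", "phangorn", "phytools", "geiger")

# Keep only the packages that are not installed yet
installed <- rownames(installed.packages())
missing_pkgs <- setdiff(needed, installed)

if (length(missing_pkgs) > 0) {
  message("Not yet installed: ", paste(missing_pkgs, collapse = ", "))
  # Uncomment the next line to install them from CRAN:
  # install.packages(missing_pkgs)
} else {
  message("All packages are already installed.")
}
```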
You can check which version of each package is installed by using the command packageVersion("package-name")
packageVersion("ape")
## [1] '5.5'
Once we have all the necessary packages loaded into our markdown file, we can start playing around with building some phylogenetic trees! If you already know the relationships between the groups of species that you want to plot, and the clade you are plotting isn’t too complex, you can simply write out the tree as a text string in Newick format! Let’s try it first with letters:
text.string<-
"(((A,B),C),(D,E));"
example.tree<-read.tree(text=text.string)#this command reads trees in Newick format like we did above
plot(example.tree,no.margin=TRUE,edge.width=2)
Looks good! Now we can try plotting a clade of whale species:
#Newick labels cannot contain unquoted spaces, so the names use underscores
text.string<-
"((((humpback_whale,fin_whale),(Antarctic_minke_whale,common_minke_whale)),bowhead_whale),sperm_whale);"
whale.tree<-read.tree(text=text.string)
plot(whale.tree,no.margin=TRUE,edge.width=2)
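Before moving on to other plot styles, it can help to look at what read.tree() actually returns. This short sketch, assuming {ape} is loaded, rebuilds the letter tree from above and inspects a few parts of the resulting object.

```r
library(ape)

example.tree <- read.tree(text = "(((A,B),C),(D,E));")

class(example.tree)       # the object class used by ape
Ntip(example.tree)        # number of tips
Nnode(example.tree)       # number of internal nodes
is.rooted(example.tree)   # TRUE if the tree has a root
example.tree$edge         # matrix of parent-child node pairs, one row per branch
```

The topology lives in the edge matrix and the names live in tip.label; the plotting functions only draw what these two pieces describe.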
There are many different commands that will allow you to visualize your tree in many different ways. Let’s try a few!
roundPhylogram(whale.tree) #creates rounded branches in tree
plot(unroot(whale.tree),type="unrooted",no.margin=TRUE,lab4ut="axial",
edge.width=2) #creates an unrooted tree
plotTree(whale.tree,type="fan",fsize=0.7,lwd=1,
ftype="i") #creates a fan tree
Load the tree files into R
Adjust and manipulate tree using ggtree package to make graphics more readable
Comparing Trees
Now that we are a little more comfortable working with phylogenetic trees in R, we can load our first tree!
First, use this link (“https://github.com/sinnabunbun/Super-Fly-Group-Module”) to go to the Super Fly Group Module repo. From there, click on the “NJst.tre” file. Open the file using the little pencil icon and copy its contents. Then go to your own working repo, select the “Create New File” option, and paste the tree data into that file. Name the new file “NJst.tre” so that the code below can find it.
Once you have the tree file saved in your repo, you can load the tree into your R markdown file using the “read.nexus” command
mol.tree<-read.nexus(file="NJst.tre")
mol.tree
##
## Phylogenetic tree with 66 tips and 64 internal nodes.
##
## Tip labels:
## A_rothschildi, A_sichaeus, A_boreas_boreas, A_saundersii_saundersii, A_attica_attica, A_leuceroides, ...
##
## Unrooted; includes branch lengths.
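A “.tre” extension does not by itself say whether a file holds Newick or NEXUS text, so it is worth knowing that {ape} has a separate reader for each. The sketch below writes a toy tree to temporary files in both formats and reads each one back; the file and object names are hypothetical and are not part of the Adelpha data.

```r
library(ape)

toy_tree <- read.tree(text = "(((A,B),C),(D,E));")

# Write the same tree in both formats to temporary files
newick_file <- tempfile(fileext = ".tre")
nexus_file  <- tempfile(fileext = ".nex")
write.tree(toy_tree, file = newick_file)
write.nexus(toy_tree, file = nexus_file)

# Each reader expects its own format
from_newick <- read.tree(file = newick_file)
from_nexus  <- read.nexus(file = nexus_file)

# A NEXUS file starts with a #NEXUS header line
readLines(nexus_file, n = 1)
```

If a tree file fails to load, opening it in a text editor and checking for the #NEXUS header is a quick way to decide which reader to use.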
Once the tree is loaded, we can try plotting it!
plotTree(mol.tree,ftype="i",fsize=0.6,lwd=2, no.margin = TRUE)
The Ntip() function will tell you how many different species (or tips) are represented in your tree
Ntip(mol.tree) ##66 species in this tree
## [1] 66
Just like with the whale tree above, we can use “unroot” to create an unrooted tree
plot(unroot(mol.tree),type="unrooted",cex=0.6,
use.edge.length=FALSE,lab4ut="axial",
no.margin=TRUE)
##unrooted tree
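Going the other direction, an unrooted tree can be rooted if you are willing to name an outgroup. This is only a sketch on a toy tree, assuming {ape} is loaded; tip “A” is treated as a hypothetical outgroup, and nothing here says which taxa should serve as the outgroup for the Adelpha tree.

```r
library(ape)

# Toy unrooted tree: three branches meet at the base
unrooted_tree <- read.tree(text = "(A,B,(C,D));")
is.rooted(unrooted_tree)

# Root on tip "A", treated here as a hypothetical outgroup
rooted_tree <- root(unrooted_tree, outgroup = "A", resolve.root = TRUE)
is.rooted(rooted_tree)
```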
We can also make a fan tree
plotTree(mol.tree,type="fan",fsize=0.7,lwd=2,
ftype="i")
If you want to see all the species names, use the code mol.tree$tip.label
##all the species names
mol.tree$tip.label
## [1] "A_rothschildi" "A_sichaeus"
## [3] "A_boreas_boreas" "A_saundersii_saundersii"
## [5] "A_attica_attica" "A_leuceroides"
## [7] "A_zina_irma" "A_zina_zina"
## [9] "A_jordani" "A_justina_valentina"
## [11] "A_olynthia" "A_boeotia_boeotia"
## [13] "A_malea_aethalia" "A_delinita"
## [15] "A_heraclea" "A_naxia"
## [17] "A_capucinus_capucinus" "A_mesentina"
## [19] "A_phylaca_pseudaethalia" "A_lycorias_lara"
## [21] "A_lycorias_spruceana" "A_erotia_erotia"
## [23] "A_pollina" "A_irmina_tumida"
## [25] "A_leucophthalma_irminella" "A_cocala_cocala"
## [27] "L_lorquini" "L_weidemeyerii"
## [29] "L_arthemis_arizonensis" "L_arthemis_arthemis"
## [31] "L_archippus_floridanensis" "L_populi"
## [33] "L_sydyi" "L_amphyssa"
## [35] "L_moltrechti" "L_doerriesi"
## [37] "L_helmanni" "L_camilla"
## [39] "L_homeyeri" "L_glorifica"
## [41] "L_reducta" "A_donysa_donysa"
## [43] "A_pithys" "A_alala_negra"
## [45] "A_tracta" "A_corcyra_aretina"
## [47] "Athyma_selenophora" "Pandita_sinope"
## [49] "Sumalia_daraxa" "Parasarpa_zayla"
## [51] "Moduza_urdaneta" "A_seriphia_aquillia"
## [53] "A_seriphia_therasia" "A_serpa_celerio"
## [55] "A_melona_leucocoma" "A_cytherea_cytherea"
## [57] "A_cytherea_daguana" "A_salmoneus_colada"
## [59] "A_iphicleola_thessalita" "A_iphiclus_iphiclus"
## [61] "A_thessalia_thessalia" "A_epione_agilla"
## [63] "A_ethelda_ethelda" "A_basiloides"
## [65] "A_plesaure_phliassa" "A_shuara"
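Since the tip labels are an ordinary character vector, you can search them and prune the tree to a subset. The sketch below uses a toy tree whose made-up labels imitate the prefix pattern seen above (the “A_” and “L_” prefixes appear to abbreviate genus names, which is an assumption here). It relies on keep.tip() from {ape}.

```r
library(ape)

# Toy tree whose labels mimic the prefix_species pattern above
toy_tree <- read.tree(text = "((A_one,A_two),(L_one,(L_two,A_three)));")

# Find labels that start with "A_"
a_tips <- grep("^A_", toy_tree$tip.label, value = TRUE)

# keep.tip() prunes the tree down to the chosen tips
a_only <- keep.tip(toy_tree, a_tips)
a_only$tip.label
```

Note that a pattern like "^A_" would not match a label such as "Athyma_selenophora", because the underscore must come directly after the first letter.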
You can also add arrows to draw attention to specific species!
##add an arrow on a specific branch tip
plotTree(mol.tree,type="fan",fsize=0.7,lwd=1,
ftype="i")
add.arrow(mol.tree,tip="A_saundersii_saundersii",arrl=1)